Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages
Multilingual topic models can reveal patterns in cross-lingual document collections. However, existing models lack speed and interactivity, which prevents their adoption in everyday corpus exploration or fast-moving situations (e.g., natural disasters, political instability). First, we propose a multilingual anchoring algorithm that builds an anchor-based topic model for documents in different languages. Then, we incorporate interactivity to develop MTAnchor (Multilingual Topic Anchors), a system that allows users to refine the topic model. We test our algorithms on labeled English, Chinese, and Sinhalese documents. Within minutes, our methods can produce interpretable topics that are useful for specific classification tasks.
Reviews: Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages
The paper proposes an anchor-based approach for learning a topic model in multilingual settings. The model is first built automatically and can then be refined through interaction with a user. The paper addresses a problem that has been of interest in previous NIPS submissions. Quality: The paper is technically sound. I like that bilingual topic modeling requires only a dictionary between the two languages; the documents do not need to be aligned to train the topic model.
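To illustrate the point about needing only a dictionary and no aligned documents, here is a minimal sketch of how dictionary-linked anchor words might be chosen greedily. It is not the paper's implementation: the function names, the use of a linear (rather than affine) span, and the "minimum distance across both languages" scoring rule are assumptions made for illustration. `Q_a` and `Q_b` stand for row-normalized word co-occurrence matrices computed separately from each language's documents, and `dictionary` is a hypothetical list of `(i, j)` index pairs meaning word `i` in language A translates to word `j` in language B.

```python
import numpy as np


def distance_to_span(vec, basis):
    """Euclidean distance from vec to the span of the orthonormal vectors in basis."""
    if len(basis) == 0:
        return float(np.linalg.norm(vec))
    B = np.array(basis)                      # shape (k, dim), orthonormal rows
    residual = vec - B.T @ (B @ vec)         # remove the component inside the span
    return float(np.linalg.norm(residual))


def extend_basis(vec, basis):
    """Gram-Schmidt step: append the normalized residual of vec to basis (in place)."""
    if basis:
        B = np.array(basis)
        vec = vec - B.T @ (B @ vec)
    norm = np.linalg.norm(vec)
    if norm > 1e-12:                         # skip vectors already inside the span
        basis.append(vec / norm)


def select_linked_anchors(Q_a, Q_b, dictionary, num_topics):
    """Greedily pick dictionary pairs whose words are far from the chosen anchors
    in BOTH languages (assumed scoring rule: the smaller of the two distances)."""
    candidates = list(dict.fromkeys(dictionary))   # drop duplicate pairs, keep order
    basis_a, basis_b, anchors = [], [], []
    for _ in range(min(num_topics, len(candidates))):
        remaining = [p for p in candidates if p not in anchors]
        best = max(
            remaining,
            key=lambda p: min(
                distance_to_span(Q_a[p[0]], basis_a),
                distance_to_span(Q_b[p[1]], basis_b),
            ),
        )
        anchors.append(best)
        extend_basis(Q_a[best[0]], basis_a)
        extend_basis(Q_b[best[1]], basis_b)
    return anchors
```

Only the dictionary connects the two matrices in this sketch; each co-occurrence matrix is built from its own monolingual documents. Scoring a pair by the smaller of its two distances is one way to favor anchors that are distinctive in both languages rather than in just one.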
Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages
Michelle Yuan, Benjamin Van Durme, Jordan L. Ying